Lesson 3: The ReAct Architecture (Reason + Act)
The Decision Engine from the previous lesson already contained a solid set of fault-tolerant code.
If you look closely, you'll find two loops inside it:
- The first loop handles replies that don't meet the requirements or contain errors. The current strategy: append the error message to the history and ask again; once the maximum number of attempts is exceeded, stop asking and return a conservative fallback answer. In the function decide_with_retry:

for attempt in range(max_retries + 1):
    # Within a bounded number of attempts, attach the error message and retry
    # If the previous reply failed validation, add the error to this round's messages
    if err_msg:
        messages.append({
            "role": "system",
            "content": f"Your previous output failed validation: {err_msg}. "
                       f"Output ONLY one valid JSON object that matches the schema."
        })
    # Get a reply
    raw = call_llm(messages)
    # Error type 1: the reply string is not strictly valid JSON
    try:
        obj = json.loads(raw)  # convert the str into a dict
    except Exception as e:
        err_msg = f"Invalid JSON parse error: {type(e).__name__}"
        continue
    # Error type 2: the reply is logically invalid; check it with the function defined earlier
    ok, reason = validate_decision(obj, tools_available)
    if ok:
        # Success: return the decision dict, containing action, tool, tool_input, final
        return obj
    else:
        # Failure: record the reason so it is added to the next round's conversation
        err_msg = reason
    # Exponential backoff (simple version)
    time.sleep(0.4 * (2 ** attempt))

- The second "loop" handles tool-call failures, though it isn't actually written as a for loop. The current strategy: write the tool's error message into the Observation (the tool-call result), add it to the chat history via decide_with_retry, and ask the model again once (no retry loop). In effect only one tool-call failure is tolerated; this could easily be changed to allow a configurable maximum failure count.

if decision1["action"] == "tool_call":   # if the first decision is "use a tool"
    tool = decision1["tool"]             # the tool's name
    tool_input = decision1["tool_input"] # the input the tool needs
    # Run the tool to get its return value (the Observation)
    try:
        if tool == "search":
            query = tool_input.get("query", "")
            obs = TOOLS["search"](query)
        else:
            raise RuntimeError("Unknown tool")
    except Exception as e:
        obs = f"[TOOL_ERROR] {type(e).__name__}: {e}"
    # Step 2: inject the Observation and ask the model to produce a final answer based on it
    decision2 = decide_with_retry(
        state={"goal": goal, "note": "Use the observation to answer."},  # instruct: answer using the observation
        tools_available=tools_available,
        last_observation=obs
    )
    print("Decision2:", json.dumps(decision2, ensure_ascii=False))
else:
    # No tool needed; print the final answer directly
    print("Final:", decision1["final"])
(1) The ReAct Structure
This check-and-recover mechanism is exactly the ReAct architecture this lesson covers: it handles failures in both model replies and tool calls, without spinning into an infinite loop.
ReAct = Reason + Act + Observation
In other words:
- At each step, the LLM makes a decision following the template (Reason), which is then validated
- The tool chosen by the LLM is invoked (Act), yielding feedback (Observation)
- The Observation goes into the next round's conversational memory, until finish
[State] → LLM(Decision) → [Action JSON]
↓
Validator / Guardrails
↓
Tool Executor / Error Handler
↓
Observation
↓
Update State
↺
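The loop in the diagram can be written as a skeleton. This is a rough sketch with hypothetical stand-in names (decide, validate, execute); the concrete implementation appears at the end of this lesson.

```python
def react_skeleton(state, decide, validate, execute, max_steps=6):
    # decide/validate/execute are stand-ins for the concrete functions later in this lesson
    observation = None
    for _ in range(max_steps):
        decision = decide(state, observation)       # [State] -> LLM -> [Action JSON]
        ok, reason = validate(decision)             # Validator / Guardrails
        if not ok:
            observation = f"[INVALID] {reason}"     # a validation failure is just another Observation
            continue
        if decision["action"] == "finish":
            return decision["final"]
        observation = execute(decision)             # Tool Executor / Error Handler
        state = {**state, "last_obs": observation}  # Update State, then loop again
    return "Failed: exceeded max_steps."
```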
(2) Fault-Tolerance Mechanisms
1. Validating the Model's Reply (Validator)
raw = call_llm(...)
obj = json.loads(raw)
validate_decision(obj)
The Validator performs at least three layers of checks:
- Structure: are all required fields present?
- Types: isinstance(...)
- Semantics:
  - is the action allowed?
  - does the tool exist?
  - are the action and the other fields consistent?
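Condensed into code, the three layers might look like this. This is a simplified sketch; the complete validator later in this lesson adds tool_input schema checks on top.

```python
ALLOWED_ACTIONS = {"tool_call", "finish", "replan", "ask_user"}

def validate_minimal(obj: dict, tools_available: set):
    # Layer 1 (structure): every required key must be present
    for k in ("action", "tool", "tool_input", "final"):
        if k not in obj:
            return False, f"Missing key: {k}"
    # Layer 2 (types): spot-check the types that later code relies on
    if obj["tool_input"] is not None and not isinstance(obj["tool_input"], dict):
        return False, "tool_input must be an object or null"
    # Layer 3 (semantics): allowed action, existing tool, consistent fields
    if obj["action"] not in ALLOWED_ACTIONS:
        return False, f"Invalid action: {obj['action']}"
    if obj["action"] == "tool_call" and obj["tool"] not in tools_available:
        return False, f"Unknown tool: {obj['tool']}"
    return True, "ok"
```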
2. Checking Tool Calls
Tool failure ≠ Agent crash
Tool failure = a new Observation
try:
    obs = tool(...)
except Exception as e:
    obs = f"[TOOL_ERROR] {type(e).__name__}: {e}"
Then:
- feed this obs back to the model as an Observation
- let the model decide whether to:
  - switch tools
  - replan
  - ask_user
  - finish (graceful degradation)
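As a minimal sketch of the pattern (flaky_tool is a hypothetical always-failing tool, used only to demonstrate capturing the error):

```python
def flaky_tool(query: str) -> str:
    # Hypothetical tool that always fails, to demonstrate error capture
    raise TimeoutError("upstream search timed out")

def run_tool_safely(tool, **kwargs) -> str:
    # Never raise: convert any exception into an Observation string
    try:
        return tool(**kwargs)
    except Exception as e:
        return f"[TOOL_ERROR] {type(e).__name__}: {e}"

obs = run_tool_safely(flaky_tool, query="what is an agent")
# obs is now "[TOOL_ERROR] TimeoutError: upstream search timed out"
# and is fed back to the model as the next Observation
```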
What you must never do ❌
- raise the tool exception directly
- hide the failure reason from the model
- retry silently, without limit
3. How to Keep the Agent from Looping Forever
You need at least four layers of protection:
- max_steps (a hard cap)
  for step in range(max_steps):
- Action constraints (no running wild)
  - no unlimited repetition of the same tool_call
  - a cap on consecutive replans
- Stalled-state detection
  if state == last_state:
      force_replan()
- Exponential backoff (which you just learned)
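These four protections can be combined in one skeleton. guarded_loop and step_fn are hypothetical names for illustration; the real agent loop at the end of this lesson implements the same ideas.

```python
import time

def guarded_loop(step_fn, max_steps=6, max_repeats=2, max_replans=3):
    # step_fn() returns (action, state); this skeleton only shows the guards
    last_action, last_state = None, None
    repeats, replans = 0, 0
    for step in range(max_steps):              # 1) hard cap on total steps
        action, state = step_fn()
        repeats = repeats + 1 if action == last_action else 0
        if repeats >= max_repeats or state == last_state:
            action = "replan"                  # 2)+3) force a replan on repeats / no progress
        if action == "replan":
            replans += 1
            if replans > max_replans:          # 2) cap consecutive replans
                return "Unable to proceed. Please clarify the goal."
            time.sleep(min(0.05, 0.01 * 2 ** replans))  # 4) (tiny) exponential backoff
        if action == "finish":
            return state
        last_action, last_state = action, state
    return "Failed: exceeded max_steps."
```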
(3) RePlan: Recovering from a Broken Chain of Thought
Typical symptoms you may see:
- consecutive tool_calls whose results are useless
- the same action repeated over and over
- outputs growing shorter / going empty
- the Validator failing repeatedly
This is why ReAct must have a "self-rescue" capability (Re-evaluate / Replan).
Strategy 1: an explicit replan action
You have already allowed:
{"action": "replan"}
When replanning:
- clear the observation
- update the state:
state["note"] = "Previous approach failed. Try a new plan."
Strategy 2: a forced meta prompt
Add one line to the system prompt:
If you are stuck or repeating actions, choose "replan".
Strategy 3: an external hard cutoff
if repeated_actions > 2:
    return "Unable to proceed. Please clarify the goal."
(4) Tool Schemas
If you look closely at the current code, you'll notice:
👉 nothing anywhere specifies that tool_search's input parameter must be named query
👉 the model outputs {"query": ...} purely as guessed / habitual / probabilistic behavior
👉 from an engineering standpoint, this is ❌ unsafe, ❌ uncontrollable, and ❌ a bug waiting to happen
Therefore the tool schema must be declared explicitly (and shown to the model).
1. The Prompt Template
TOOL_SCHEMAS = {
    "search": {
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search keywords"
                }
            },
            "required": ["query"]
        }
    }
}
Inject the schema into the prompt (this is the key step):
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "system", "content": f"Tool schemas:\n{json.dumps(TOOL_SCHEMAS)}"},
    {"role": "user", "content": json.dumps({"state": state})}
]
Only now does the model know that:
- the search tool must be given a query
- query is a string
- omitting it makes the call invalid
2. Upgrading the Validator to Match
if action == "tool_call":
    schema = TOOL_SCHEMAS[tool]["parameters"]
    required = schema["required"]
    for k in required:
        if k not in obj["tool_input"]:
            return False, f"Missing required tool_input field: {k}"
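For example, this required-field check rejects a tool_input that is missing query. This is a standalone sketch using the schema from this section; the fuller _validate_tool_input_schema below also checks types and unexpected fields.

```python
TOOL_SCHEMAS = {
    "search": {
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        }
    }
}

def check_required(tool: str, tool_input: dict):
    # Reject a tool_call whose input is missing a schema-required field
    schema = TOOL_SCHEMAS[tool]["parameters"]
    for k in schema["required"]:
        if k not in tool_input:
            return False, f"Missing required tool_input field: {k}"
    return True, "ok"
```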
3. Complete Code
- Guardrails: action space + tool whitelist + tool schemas
ALLOWED_ACTIONS = {"tool_call", "finish", "replan", "ask_user"}

TOOL_SCHEMAS = {
    "search": {
        "description": "Search the web for information. Use it when you need external facts.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search keywords"},
            },
            "required": ["query"],
            "additionalProperties": False
        }
    }
}

ALLOWED_TOOLS = set(TOOL_SCHEMAS.keys())
- Tool implementation (a "fake search" for the demo; you can swap in a real web search)
def tool_search(query: str) -> str:
    # Demo: return fixed content here, to demonstrate Observation injection
    # Replace with a real search: SerpAPI / your own crawler / your web.run, etc.
    return f"[SEARCH_RESULT] query={query}\n- Agent = a system that uses an LLM to decide actions, can use tools, and updates state using observations."

TOOLS = {"search": tool_search}
- System Prompt: strong constraints
SYSTEM_PROMPT = f"""
You are an agent decision engine.
You MUST output exactly one JSON object. No markdown, no extra text.
Allowed actions: {sorted(ALLOWED_ACTIONS)}.
Allowed tools (only if action is tool_call): {sorted(ALLOWED_TOOLS)}.
Decision JSON Schema:
{{
"action": "tool_call|finish|ask_user|replan",
"tool": "string|null",
"tool_input": "object|null",
"final": "string|null"
}}
Rules (anti-hallucination):
- You MUST NOT fabricate any external facts.
- Tool results can ONLY come from an Observation provided by the system.
- If you need external info, choose action="tool_call" and specify tool/tool_input.
- When action="tool_call": final MUST be null.
- When action="finish": tool and tool_input MUST be null.
- If you are stuck or repeating, choose action="replan" or "ask_user".
Tool Schemas:
{json.dumps(TOOL_SCHEMAS, ensure_ascii=False)}
""".strip()
- The LLM call
def call_llm(messages, model="gpt-4o", max_tokens=280, temperature=0.2) -> str:
    headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    r = requests.post(CHAT_URL, headers=headers, json=payload, timeout=30)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]
- Validators: strictly validate the decision (structure + types + semantics + tool_input schema)
def _validate_tool_input_schema(tool: str, tool_input: Dict[str, Any]) -> Tuple[bool, str]:
    schema = TOOL_SCHEMAS[tool]["parameters"]
    required = schema.get("required", [])
    props = schema.get("properties", {})
    additional = schema.get("additionalProperties", True)
    # required fields
    for k in required:
        if k not in tool_input:
            return False, f"Missing required tool_input field: {k}"
    # type checks (minimal)
    for k, v in tool_input.items():
        if k not in props:
            if additional is False:
                return False, f"Unexpected tool_input field: {k}"
            continue
        expected_type = props[k]["type"]
        if expected_type == "string" and not isinstance(v, str):
            return False, f"tool_input.{k} must be string"
        if expected_type == "object" and not isinstance(v, dict):
            return False, f"tool_input.{k} must be object"
    return True, "ok"
def validate_decision(obj: Dict[str, Any], tools_available: set) -> Tuple[bool, str]:
    # Required keys
    for k in ("action", "tool", "tool_input", "final"):
        if k not in obj:
            return False, f"Missing key: {k}"
    # Action must be allowed
    action = obj["action"]
    if action not in ALLOWED_ACTIONS:
        return False, f"Invalid action: {action}"
    # Semantic consistency for tool_call
    if action == "tool_call":
        tool = obj["tool"]
        if not isinstance(tool, str):
            return False, f"tool must be string when action=tool_call, got: {tool}"
        if tool not in tools_available:
            return False, f"Tool not allowed/available: {tool}"
        if not isinstance(obj["tool_input"], dict):
            return False, "tool_input must be an object when action=tool_call"
        ok, reason = _validate_tool_input_schema(tool, obj["tool_input"])
        if not ok:
            return False, reason
        if obj["final"] is not None:
            return False, "final must be null when action=tool_call"
    else:
        # Not a tool_call: tool/tool_input must be null; final must be a str
        # (ask_user/replan/finish all carry human-readable text)
        if obj["tool"] is not None or obj["tool_input"] is not None:
            return False, "tool and tool_input must be null when action is not tool_call"
        if not isinstance(obj["final"], str):
            return False, "final must be a string when action is not tool_call"
    return True, "ok"
- decide_with_retry: automatic corrective retries on parse/validation failure
def decide_with_retry(
    state: Dict[str, Any],
    tools_available: set,
    last_observation: Optional[str] = None,
    max_retries: int = 2,
    model: str = "gpt-4o",
) -> Dict[str, Any]:
    base_messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": json.dumps({"state": state, "tools_available": sorted(tools_available)}, ensure_ascii=False)},
    ]
    if last_observation is not None:
        # Key: inject the Observation as a system message, telling the model
        # it is the only trusted source of external facts
        base_messages.append({"role": "system", "content": f"Observation:\n{last_observation}"})
    err_msg = None
    for attempt in range(max_retries + 1):
        messages = list(base_messages)
        if err_msg:
            messages.append({
                "role": "system",
                "content": f"Your previous output failed validation: {err_msg}. "
                           f"Output ONLY one valid JSON object that matches the schema."
            })
        raw = call_llm(messages, model=model, temperature=0.2, max_tokens=280)
        try:
            obj = json.loads(raw)
        except Exception as e:
            err_msg = f"Invalid JSON parse error: {type(e).__name__}"
            time.sleep(min(2.0, 0.4 * (2 ** attempt)) + random.random() * 0.1)
            continue
        ok, reason = validate_decision(obj, tools_available)
        if ok:
            return obj
        err_msg = reason
        time.sleep(min(2.0, 0.4 * (2 ** attempt)) + random.random() * 0.1)
    # Graceful degradation
    return {
        "action": "ask_user",
        "tool": None,
        "tool_input": None,
        "final": "I couldn't produce a valid tool/action plan. Please clarify your goal and constraints."
    }
- Agent Loop: the model decides how many tool calls to make and when to stop
def run_cot_tool_agent(
    goal: str,
    max_steps: int = 6,
    model: str = "gpt-4o",
) -> str:
    """
    Key points:
    - Each round: the LLM outputs a decision JSON
    - tool_call: run the tool, get an Observation, feed it back to the LLM
    - finish: return final
    - replan: update state / reset observation, continue
    - ask_user: return final directly
    """
    tools_available = set(TOOLS.keys())
    state = {"goal": goal}
    observation = None
    for step in range(max_steps):
        decision = decide_with_retry(
            state=state,
            tools_available=tools_available,
            last_observation=observation,
            max_retries=2,
            model=model,
        )
        action = decision["action"]
        if action == "finish":
            return decision["final"]
        if action == "ask_user":
            return decision["final"]
        if action == "replan":
            # Simplest replan: write a note into state telling the model to change approach
            state["note"] = "Replan: change approach. If external facts needed, call a tool."
            observation = None
            continue
        if action == "tool_call":
            tool = decision["tool"]
            tool_input = decision["tool_input"]
            # Run the tool: return exceptions to the model as an Observation (do not raise)
            try:
                obs = TOOLS[tool](**tool_input)
            except Exception as e:
                obs = f"[TOOL_ERROR] {type(e).__name__}: {e}"
            observation = obs
            continue
    return "Failed: exceeded max_steps. Please refine the goal."
if __name__ == "__main__":
    goal = "Explain what an agent is. If external facts are needed, use the search tool. Keep it concise."
    answer = run_cot_tool_agent(goal, max_steps=6, model="gpt-4o")
    print(answer)